Influenza virus can rapidly change its antigenicity, via mutation in the hemagglutinin (HA) protein, to evade host immunity. 
The emergence of the novel human-infecting avian H7N9 virus in China has caused widespread concern. However, evolution 
of the antigenicity of this virus is not well understood. Here, we inferred the antigenic epitopes of the HA protein from all H7 
viruses, based on the five well-characterized HA epitopes of the human H3N2 virus. By comparing the two major H7 phylo- 
genetic lineages, i.e., the Eurasian lineage and the North American lineage, we found that epitopes A and B are more frequent- 
ly mutated in the Eurasian lineage, while epitopes B and C are more frequently mutated in the North American lineage. Fur- 
thermore, we found that the novel H7N9 virus (derived from the Eurasian lineage) isolated in China in the year 2013, contains 
six frequently mutated sites on epitopes that include site 135, which is located in the receptor binding domain. This indicates 
that the novel H7N9 virus that infects humans may already have been subjected to gradual immune pressure and receptor-binding 
variation. Our results not only provide insights into the antigenic evolution of the H7 virus but may also help in the 
selection of suitable vaccine strains. 

The avian influenza A virus of the H7 subtype is classified 
as a low pathogenic avian influenza (LPAI) virus and has 
caused sporadic human infections in recent years [1,2]. In 
the spring of 2013, an outbreak of the novel H7N9 virus 
occurred in poultry in eastern China [3-6] and later caused 
human infections and deaths [4]. Based on two major se- 
quential re-assortments with the H9N2 virus, we inferred 
that the virus originated from both wild birds and domestic 
poultry [7]. Recently, Lam et al. [8] analyzed H7N9 viruses 
that had emerged between December 2013 and April 2014, 
and found that H7N9 had spread from eastern to southern 
China, generating multiple distinct lineages. The rapid evolution 
of this virus has caused widespread concern because of its 
changing pathogenesis, host adaptation [9,10], and antigenicity. 

Hemagglutinin (HA), the main antigen and major surface 
protein of the influenza A virus, is key to the process of 
infection caused by this virus. It is the primary target for 
neutralizing antibodies, which inhibit attachment of 
the virus to the target cells and subsequent membrane fusion 
[11]. In order to evade surveillance by the host’s immune 
system, the influenza virus has gained the ability to change 
the antigenic properties of HA through mutation or 
re-assortment of the corresponding gene. Usually, HA anti- 
genic properties are determined by a group of residues clus- 
tered into regions called epitopes. More specifically, the HA 
epitopes are composed of amino acids that directly interact 
with the neutralizing antibodies [12]. Thus, rapid mutation 
of HA epitopes could disrupt its interaction with antibodies, 
resulting in viral immune escape. 

Given their critical role in influenza antigenic variation 
and strain selection for flu vaccines, many studies on influ- 
enza virus have focused on HA epitopes. Human H3N2 
virus HA epitopes have been well characterized; they con- 
sist of five epitopes (A-E) [13-15]. Additionally, by com- 
paring the HA structures, the H3 HA epitopes were used to 
define the epitopes for H1 and H2 HA [16]. More recently, 
using computational integration, we have formed a com- 
prehensive picture of HA epitopes of highly pathogenic 
avian H5N1 viruses [17]. 

In this study, by integrating epitope regions mapped from 
well-characterized H3N2 viruses, we inferred the antigenic 
epitope regions of H7, and analyzed mutations on five in- 
ferred epitopes for both lineages of H7 viruses. We further 
investigated the mutation frequencies of the different 
epitopes during the evolution of the two H7 lineages, and 
found that, although the mutations differ between the two 
lineages, epitope B changed frequently in both lineages. 
Furthermore, we identified new mutations that occurred in 
the five inferred epitopes of the 2013 H7N9 viruses, which 
showed significant changes in the antigenicity of this new 
H7N9 virus as compared to earlier H7N9 viruses. These 
findings yield insight into the antigenicity of H7N9 and may 
facilitate surveillance of influenza H7N9 and vaccine selec- 
tion. 


1 Materials and methods 
1.1 Data preparation 


Full-length HA sequences of human influenza H3N2 viruses, 
avian influenza H7 viruses, and H7N9 viruses were ob- 
tained from the NCBI Influenza Virus Resource [18] on 
March 16, 2015. Laboratory-derived and identical HA se- 
quences were removed. For each subtype, the HA sequences 
were aligned using MUSCLE software [19], and the align- 
ments were checked manually. After alignment, the signal 
peptide regions were removed and the sequences with X 
character content exceeding 10% were eliminated. This re- 
sulted in the inclusion of 611 HA1 sequences for H7 and 
3343 HA1 sequences for human H3N2. We obtained 467 
and 87 HA1 sequences for avian and human-infecting 
H7N9 viruses, respectively, collected since the 2013 outbreak. 
The structural data for H7 HA (A/NETHERLANDS/219/2003(H7N7), 
PDB ID: 4DJ8), H3N2 HA (A/AICHI/2/1968(H3N2), PDB ID: 5HMG), 
H1N1 HA (A/Puerto Rico/8/1934(H1N1), PDB ID: 1RU7), H2N2 HA 
(A/Japan/305/1957(H2N2), PDB ID: 3KU3), and H5N1 HA 
(A/Indonesia/5/2005(H5N1), PDB ID: 4K64) were downloaded 
from the Protein Data Bank (PDB) database 
(http://www.rcsb.org/pdb/home/home.do). 


1.2 Site entropy 


Information entropy for each site was computed using the 
method described by Wiley and Skehel [15]. For each posi- 
tion, the information entropy was further normalized by the 
base entropy (natural logarithm of 20), which was computed 
by assuming even distribution of the 20 amino acids. To 
illustrate the differences among sites more clearly, we de- 
fined relative entropy, which was calculated as the ratio of 
information entropy for each site to the average entropy. 
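
The entropy calculation described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the alignment is a toy example, and the choice to ignore gaps and X characters within a column is an assumption that the text does not specify.

```python
import math
from collections import Counter

BASE_ENTROPY = math.log(20)  # entropy of 20 equally frequent amino acids


def site_entropy(column):
    """Shannon entropy (natural log) of one alignment column, divided by ln(20)."""
    # Assumption: gaps ("-") and ambiguous residues ("X") are ignored.
    residues = [aa for aa in column if aa not in "-X"]
    if not residues:
        return 0.0
    total = len(residues)
    entropy = 0.0
    for count in Counter(residues).values():
        p = count / total
        entropy -= p * math.log(p)
    return entropy / BASE_ENTROPY


def relative_entropies(alignment):
    """Ratio of each site's entropy to the average entropy over all sites."""
    entropies = [site_entropy(col) for col in zip(*alignment)]
    mean = sum(entropies) / len(entropies)
    if mean == 0:
        return [0.0] * len(entropies)
    return [h / mean for h in entropies]


# Toy alignment (hypothetical): site 1 is invariant, site 2 is split evenly.
toy_alignment = ["AC", "AD", "AC", "AD"]
print(relative_entropies(toy_alignment))
```

With this definition, a relative entropy above 1 marks a site that varies more than the average site of the same alignment.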


1.3 Mapping the epitope from H3 HA to H7 HA 


The crystal structure of H7 HA (A/NETHERLANDS/219/ 
2003(H7N7), PDB ID: 4DJ8) was aligned to that of H3 HA 
(strain A/AICHI/2/1968(H3N2), PDB ID: 5HMG) by using 
TM-align [20]. Then, the sites in H7 HA corresponding to 
those of the A-E epitopes in H3 were defined as candidate 
antigenic sites for H7. Since the antigenic regions are 
thought to be exposed, only the surface residues of the H7 
structure 4DJ8 were retained. A residue was identified as 
exposed when the Accessible Surface Area (ASA), which 
was calculated using the NACCESS program [21] based on 
the HA trimeric complex of 4DJ8, exceeded 1 Å². 
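
The mapping and surface filter can be illustrated with a short Python sketch. It assumes that the structural alignment has already been reduced to a dictionary of aligned residue pairs and that per-residue ASA values have been parsed into a dictionary; the function name, the input layout, and all toy values below are hypothetical and are not outputs of TM-align or NACCESS.

```python
def map_epitopes(h3_epitopes, h3_to_h7, h7_asa, min_asa=1.0):
    """Transfer H3 epitope sites to H7 and keep only exposed residues.

    h3_epitopes: epitope name -> list of H3 positions
    h3_to_h7:    H3 position -> structurally aligned H7 position
    h7_asa:      H7 position -> accessible surface area in square angstroms
    """
    mapped = {}
    for name, h3_sites in h3_epitopes.items():
        kept = []
        for h3_pos in h3_sites:
            h7_pos = h3_to_h7.get(h3_pos)  # None if the H3 residue aligns to a gap
            if h7_pos is not None and h7_asa.get(h7_pos, 0.0) > min_asa:
                kept.append(h7_pos)
        mapped[name] = sorted(kept)
    return mapped


# Toy inputs (hypothetical positions and ASA values, for illustration only).
h3_epitopes = {"A": [10, 11], "B": [20]}
h3_to_h7 = {10: 8, 20: 17}        # H3 position 11 has no aligned H7 residue
h7_asa = {8: 35.2, 17: 0.4}       # H7 position 17 is buried

print(map_epitopes(h3_epitopes, h3_to_h7, h7_asa))
```

Residues that align to a gap or fall at or below the ASA threshold are dropped, which is why the number of inferred H7 sites can be smaller than the number of H3 epitope sites.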


1.4 Construction of the phylogenetic tree 


A phylogenetic tree of all H7 and H7N9 sequences was 
constructed by the neighbor-joining (NJ) method using the 
PHYLIP package [22]. 


2 Results 
2.1 Inferring five epitopes of H7 based on H3 


Previous studies showed that influenza viruses with different 
HA subtypes may share similar antigenic structures [16,23,24]. 
Moreover, the HAs of H7 and H3 subtypes belonged to the 
same clade [25]. Our structural comparison showed that the 
HA structures of the H7 virus and the human H3N2 virus 
are quite similar, with a TM-score of 0.936, and a 
root-mean-square deviation (RMSD) of 1.66 (Figure 1A). 
The TM-score and RMSD of H7 were 0.882 and 2.43 when 
compared to H1N1, 0.902 and 2.15 when compared to 
H2N2, and 0.886 and 2.26 when compared to H5N1. This 
suggested that H7 and H3N2 viruses shared a similar anti- 
genic structure. Furthermore, we aligned the local structures 
of each epitope of H7 and H3N2 viruses, and found the 
RMSD for epitopes A-E to be 1.20, 1.52, 1.07, 1.42, and 
1.71, respectively. Based on the structural comparison of 
HAs, we mapped the five known epitopes of human H3N2 
HA onto H7 HA (see Materials and methods). In this way, 
we inferred 130 antigenic sites for the H7 subtype: 18, 22, 
27, 41, and 22 antigenic sites for epitopes A, B, C, D, and E 
respectively (Figure S1). 

Figure 1 Identification of antigenic epitopes for the H7 viruses. A, Five identified antigenic epitopes, labeled A-E, on H7 HA1, based on the superimposition 
of H7 and H3. The HA1 backbones of H7 and H3 are colored yellow and purple, respectively. B, Circular phylogenetic tree of H7 viruses. Two major 
clades, North American lineage and Eurasian lineage, can be observed. C-E, Comparison of site entropy in antigenic epitopes of the H7 Eurasian lineage and 
North American lineage (C), H7 Eurasian lineage and H3N2 viruses (D), and H7 North American lineage and H3N2 viruses (E). 


2.2 Mutational profiles of HA from H7 Eurasian and 
North American lineages 


As shown in the phylogenetic tree (Figure 1B), sequences 
of H7 mainly clustered into two lineages, i.e., the 
Eurasian lineage and the North American lineage. Thus, 
we compared the similarities and differences of mutational 
profiles of HA1 among the H7 Eurasian lineage, the H7 
North American lineage, and human H3N2 viruses (the latter 
was used as a reference). Mutational profiles were represented 
by the relative entropies of all HA1 sites. As expected, 
the mutational profile of HA1 for the H7 Eurasian lineage 
was significantly correlated with that of the H7 
North American lineage, with a Pearson’s correlation coefficient 
of 0.51 (P<2.2×10⁻¹⁶, Figure 1C). This showed that 
HA1 of both lineages experienced similar selection pres- 
sures. Pearson’s correlation coefficients between the human 
H3N2 and H7 lineages exceeded 0.3 (P<1.2x10®, Figure 
1D and 1E), indicating that these were also significantly 
correlated. 
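
As a minimal sketch of how two mutational profiles can be compared, the Pearson correlation coefficient of two equal-length lists of relative entropies can be computed as below. The function name and the toy profiles are hypothetical, and the sketch returns only the coefficient; the text does not state how the P values were obtained, so no significance test is shown.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy relative-entropy profiles for the same HA1 sites (hypothetical values).
profile_one = [0.2, 1.8, 0.0, 2.5, 0.5]
profile_two = [0.1, 1.2, 0.3, 2.0, 0.9]
print(round(pearson_r(profile_one, profile_two), 3))
```

Both profiles must be indexed by the same aligned sites, so positions that are gaps in either alignment would need to be removed before the comparison.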


2.3 Mutations of five antigenic epitopes of H7 Eurasian 
and North American lineages 


Although the mutational profiles of HA of the H7 Eurasian 
and North American lineages were significantly similar, 
some differences were also observed, especially in the five 
antigenic epitopes (Figure 1C-1E). Furthermore, we compared 
the average entropies of the five antigenic epitopes, 
receptor-binding domains (RBD), and other sites of the two 
H7 lineages and human H3N2 viruses. Mutations in epitope 
D had relatively low frequencies, while mutations in epitope 
B had relatively high frequencies in both H7 and H3 viruses. 
Mutation rates in epitopes A and B of the H7 Eurasian 
lineage were higher than those in the other three epitopes (Figure 
2B). In the H7 North American lineage, epitope C 
showed an extremely high mutation rate, while 
epitopes A and D showed low mutation rates (Figure 
2C), a pattern that differed from that of the H7 Eurasian lineage. 
Furthermore, we found that the RBD sites of the H3N2 virus 
showed markedly higher frequencies of mutation than did 
those of the two H7 lineages. 

Figure 2 Average site entropies of five epitopes and the receptor-binding 
domain of both H7 lineages and human H3N2 viruses. A-C, Average site 
entropies of five epitopes, the receptor-binding domain (RBD), and other 
sites of the H7 Eurasian lineage (A), the H7 North American lineage (B), 
and H3N2 viruses (C). 


Furthermore, we attempted to identify sites with rela- 
tively high mutation rates in the receptor-binding domain. 
Sites with entropies greater than the average entropy 
were regarded as high mutation-rate sites, and all others as 
low mutation-rate sites. In total, there were 15, 8, and 4 high 
mutation-rate sites for human H3N2, H7 Eurasian lineage, 
and H7 North American lineage viruses, respectively. The 
common high mutation-rate sites were positions 135, 158, 
and 193, while positions 98, 134, 153, 183, 194, 195, 224, 
and 228 (H3 numbering) were relatively conserved 
sites for the human H3N2 virus and the two H7 lineages 
(Figure 3). 
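
The thresholding rule can be sketched as follows. This is an illustration only: it assumes that the average is taken over all sites supplied in the entropy dictionary, which the text does not specify, and the function names, site labels, and entropy values are hypothetical.

```python
def split_by_mutation_rate(site_entropy, rbd_sites):
    """Split RBD sites into high and low mutation-rate groups.

    site_entropy: position -> entropy for every site in the profile
    rbd_sites:    positions that belong to the receptor-binding domain
    """
    # Assumption: the threshold is the mean entropy over all supplied sites.
    threshold = sum(site_entropy.values()) / len(site_entropy)
    high = sorted(s for s in rbd_sites if site_entropy[s] > threshold)
    low = sorted(s for s in rbd_sites if site_entropy[s] <= threshold)
    return high, low


def common_sites(*site_lists):
    """Positions present in every list, e.g. high-rate sites shared by all groups."""
    shared = set(site_lists[0])
    for sites in site_lists[1:]:
        shared &= set(sites)
    return sorted(shared)


# Toy profiles for two virus groups (hypothetical positions and entropies).
group_one = {1: 0.9, 2: 0.8, 3: 0.1, 4: 0.2}
group_two = {1: 0.7, 2: 0.1, 3: 0.1, 4: 0.3}
rbd = [1, 2, 3]

high_one, low_one = split_by_mutation_rate(group_one, rbd)
high_two, low_two = split_by_mutation_rate(group_two, rbd)
print(high_one, high_two, common_sites(high_one, high_two))
```

Intersecting the high mutation-rate lists across groups gives the shared variable sites, and intersecting the low mutation-rate lists gives the shared conserved sites.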


2.4 Mutations in HA of H7N9 viruses isolated in China 
from 2013 to 2014 and in the previously reported avian 
H7N9 virus 


Some sites in the RBD are highly conserved among avian 
viruses, while these sites bear significant substitutions in 
human viruses [26]. By comparing the human-infecting 
H7N9 viruses isolated in China between 2013 and 2014 to 
the previously reported avian H7N9 viruses, we discovered 
that most amino acids were conserved in the RBD, while 
positions 186, 189, and 226 differed (H3 numbering; Figure 
2; Table 1). Substitution G186V has already been reported to 
influence receptor-binding specificity in Eurasian H7 viruses 
[27], while Q226L has previously been reported to increase 
the binding affinity of H7 HA for the human receptor 
[28]. Some other mutations were also observed in other in- 
ferred antigenic sites, such as position 122 in epitope A, 
position 312 in epitope C, and positions 174 and 179 in 
epitope D (H3 numbering; Table 1). Furthermore, we com- 
pared human-infecting H7N9 viruses to avian H7N9 viruses 
isolated in mainland China from 2013 to 2014 (Table 1). 
Most sites were identical, except for position 57, which was 
located on epitope E. In human-infecting H7N9 viruses, the 
majority had an R at position 57, whereas most avian influ- 
enza H7N9 had K at this position. 


2.5 Evolution of HA1 in H7N9 viruses during 2013 to 
2014 


Since the outbreak in 2013, avian H7N9 viruses have been 
circulating in China. To investigate the evolution of H7N9 


Table 1 Differences in the amino acids of hemagglutinin obtained from H7N9 viruses isolated at different times (only dominant amino acids are listed for each site) 

Epitope  Numbering in H3  Avian H7N9 before 2013 a)  Human-infecting H7N9 (2013-2014)  Avian H7N9 after 2013 b)
A        122              T                          A                                 A
B        186              G                          V                                 V
B        189              T                          A                                 A
C        312              E                          R                                 R
D        174              D                          S                                 S
D        179              I                          V                                 V
D        226              Q                          L                                 L
E        57               K                          K                                 R

a) All avian H7N9 viruses before the 2013 H7N9 outbreak. b) Avian H7N9 viruses circulating in mainland China during 2013-2014. 

during the past two years, we mapped the dynamic changes 
in the amino acids along the phylogenetic tree (Figure 4A), 
and listed the sites with less than 95% conservation. Among 
the nine sites, sites 132 and 135 were on the inferred epitope 
A, site 312 was on the inferred epitope C, site 177 was on 
the inferred epitope D, and sites 57 and 59 were on the inferred 
epitope E; site 135 was also in the RBD. Furthermore, 
we investigated the amino acid differences between 
human-infecting H7N9 viruses and avian H7N9 viruses 
(Figure 4B). Avian H7N9 viruses showed mutations at 
all nine sites, while in human-infecting H7N9 viruses, sites 
59, 114, 177, and 255 remained quite conserved. During the 
early stage of the H7N9 outbreak (March to May of 2013), 
most sites remained conserved. However, from January 
to March of 2014, all nine sites underwent significant 
mutation in the avian H7N9 viruses. Most altered amino 
acids in the human-infecting viruses were consistent with 
those in avian viruses, except for site 135, which had an S 
or T substitution, forming an N-linked glycosylation site. 
The A135T substitution of H7N9 had already been reported 
by Xu et al. [29], and had also been reported as a genetic 
marker for mammalian adaptation and virulence in other 
human-infecting avian influenza viruses, such as H10N8 [30]. 

Figure 4 Evolution of H7N9 viruses from 2013 to 2014. A, Phylogenetic tree of H7N9 viruses isolated in China during 2013-2014; dynamic changes in 
amino acids are shown along the tree (only sites with conservation less than 95% are listed). B, Dynamic changes in amino acids over time. 
Avian H7N9 viruses are shown on the left, and human-infecting H7N9 viruses are shown on the right. *: Site 276+1 means the gap between sites 276 and 277 
(H3 numbering) in TM-align. 
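
The selection of sites with less than 95% conservation can be sketched as follows. The sketch assumes that conservation means the frequency of the most common residue in an alignment column, with gaps and X characters ignored; this definition, the function name, and the toy alignment are assumptions rather than details given in the text. Positions are reported as 1-based alignment columns, not in H3 numbering.

```python
from collections import Counter


def variable_sites(alignment, threshold=0.95):
    """Return 1-based alignment columns whose conservation is below the threshold."""
    result = []
    for index, column in enumerate(zip(*alignment), start=1):
        # Assumption: gaps ("-") and ambiguous residues ("X") are ignored.
        residues = [aa for aa in column if aa not in "-X"]
        if not residues:
            continue
        most_common_count = Counter(residues).most_common(1)[0][1]
        conservation = most_common_count / len(residues)
        if conservation < threshold:
            result.append(index)
    return result


# Toy alignment (hypothetical): column 2 carries K in 18 sequences and R in 2.
toy_alignment = ["AK"] * 18 + ["AR"] * 2
print(variable_sites(toy_alignment))
```

Running the same function separately on avian and human-infecting sequences, or on sequences grouped by collection date, would give the kind of per-group comparison described above.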


3 Discussion 


In this study, we inferred the antigenic epitopes of the H7 
subtype influenza virus and analyzed the antigenicity in 
both H7 subtype lineages and human influenza H3N2 vi- 
ruses. Five epitopes were mapped from the human H3N2 
virus. H7 fell in the same clade as H3 in terms of HA classi- 
fication, and is very likely to share a similar antigenic 
structure with H3. Moreover, the H3 HA epitopes were used 
to define the epitopes for H1 and H2 HA [16,23]. Thus, it 
was reasonable to infer the antigenic sites of H7 based on 
antigenic sites of human H3N2. 

There are two main lineages, the Eurasian and North 
American lineages, in the phylogenetic tree of H7 viruses 
(Figure 1B). These two lineages showed different patterns 
in their antigenic sites. Epitopes A and B showed a high 
frequency of mutation in the Eurasian lineage, while 
epitopes B and C were frequently mutated in the North 
American lineage. Those results suggested that epitope B 
was immune-dominant in H7 viruses. Compared to earlier 
H7N9 viruses, the 2013 H7N9 virus circulating in China 
had particular mutations, most of which were located in 
the inferred epitopes. Some of these substitutions were 
non-conservative, such as T122A, E312R, and 
D174S, which suggested that these human-infecting H7N9 
viruses may already have experienced gradual host immune 
pressure. Positions 186, 189, and 226 (in H3 numbering), 
located in the receptor-binding region, also showed 
non-conservative substitutions (G186V, T189A, and Q226L), 
and the substitutions at positions 186 and 226 have been 
reported to influence the receptor-binding activity of the H7 
virus [27,28]. This may indicate that human-infecting 
H7N9 viruses have also experienced some receptor-binding 
variation. 

We also compared dynamic changes in HA1 in hu- 
man-infecting H7N9 viruses to avian influenza H7N9 vi- 
ruses from 2013 to 2014, but found no significant differ- 
ences between these two categories of viruses. For hu- 
man-infecting H7N9, most amino acid changes were con- 
sistent with avian H7N9, supporting the continuous trans- 
mission from avian to human hosts. Interestingly, site 135 
showed S and T substitutions in the human-infecting H7N9 
virus, which resulted in formation of an N-linked glycosylation 
site. Since genetic evolution is a continuous process, 
mutations in avian influenza H7N9 should continue to be 
monitored, particularly at sites in known epitopes 
and in the RBD, where changes may promote human-to-human 
transmission. 
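
To illustrate why an S or T substitution can create an N-linked glycosylation site, the sketch below scans a sequence for the canonical sequon N-X-S/T, where X is any residue except proline. The peptides are hypothetical and are not taken from H7N9 HA; the function name is likewise an assumption, and the pattern is a simplified motif check rather than a prediction of actual glycosylation.

```python
import re

# Canonical N-linked glycosylation sequon: N, any residue except P, then S or T.
# The lookahead allows overlapping motifs to be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_starts(sequence):
    """Return 1-based positions of the N residue of every N-X-S/T motif."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence)]


# Hypothetical peptides: replacing A with T at the third motif position creates a sequon.
print(sequon_starts("KNGAQ"))  # no motif
print(sequon_starts("KNGTQ"))  # motif starting at the N residue
```

A substitution at the third position of the motif is sufficient to create or remove a sequon, which is why a single S or T change can alter the potential glycosylation pattern.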

In summary, we have inferred five antigenic epitopes for 
the H7 subtype influenza virus and have further analyzed 
the mutation patterns of each of these epitopes. We also 
identified new mutations in the 2013 H7N9 virus. Our find- 
ings will facilitate monitoring of antigenic changes of this 
new virus, and may enhance H7N9 surveillance and vaccine 
selection. 